Abstract: We study the maximal achievable rate $R^*(n, \epsilon)$ for a
given block-length $n$ and block error probability $\epsilon$ over Rayleigh
block-fading channels in the noncoherent setting and in the finite
block-length regime. Our results show that for a given block-length
and error probability, $R^*(n, \epsilon)$ is not monotonic in the
channel's coherence time, but there exists a rate-maximizing
coherence time that optimally trades between diversity and cost
of estimating the channel.

I. Introduction 

It is well known that the capacity of the single-antenna 
Rayleigh-fading channel with perfect channel state information 
(CSI) at the receiver (the so-called coherent setting) is inde- 
pendent of the fading dynamics [1]. In practical wireless sys- 
tems, however, the channel is usually not known a priori at the 
receiver and must be estimated, for example, by transmitting 
training symbols. An important observation is that the training 
overhead is a function of the channel dynamics, because the 
faster the channel varies, the more training symbols are needed 
in order to estimate the channel accurately [2]-[4]. One way 
to determine the training overhead, or more generally, the 
capacity penalty due to lack of channel knowledge, is to 
study capacity in the noncoherent setting, where neither the 
transmitter nor the receiver is assumed to have a priori
knowledge of the realizations of the fading channel (but both 
are assumed to know its statistics perfectly) [5]. 

In this paper, we model the fading dynamics using the well- 
known block-fading model [6]-[8] according to which the 
channel coefficients remain constant for a period of T symbols, 
and change to a new independent realization in the next period. 
The parameter T can be thought of as the channel's coherence 
time. Unfortunately, even for this simple model, no closed- 
form expression for capacity is available to date. A capacity 
lower bound based on the isotropically distributed (i.d.) unitary
distribution is reported in [6]. In [7]-[9], it is shown that 
capacity in the high signal-to-noise ratio (SNR) regime grows 
logarithmically with SNR, with the pre-log (defined as the 
asymptotic ratio between capacity and the logarithm of SNR 
as SNR goes to infinity) being $1 - 1/T$. This agrees with
the intuition that the capacity penalty due to lack of a priori 
channel knowledge at the receiver is small when the channel's 
coherence time is large. 

In order to approach capacity, the block-length $n$ of the
codewords must be long enough to average out the fading
effects (i.e., $n \gg T$). Under practical delay constraints, however,
the actual performance metric is the maximal achievable
rate $R^*(n, \epsilon)$ for a given block-length $n$ and block error
probability $\epsilon$. By studying $R^*(n, \epsilon)$ for the case of fading
channels and in the coherent setting, Polyanskiy and Verdú
recently showed that faster fading dynamics are advantageous
in the finite block-length regime when the channel is known to
the receiver [10], because faster fading dynamics yield larger
diversity gain.

We expect that the maximal achievable rate $R^*(n, \epsilon)$ over
fading channels in the noncoherent setting and in the finite 
block-length regime is governed by two effects working in 
opposite directions: when the channel's coherence time de- 
creases, we can code the information over a larger number 
of independent channel realizations, which provides higher 
diversity gain, but we need to transmit training symbols more 
frequently to learn the channel accurately, which gives rise to 
a rate loss. 

In this paper, we shed light on this fundamental tension
by providing upper and lower bounds on $R^*(n, \epsilon)$ in the
noncoherent setting. For a given block-length and error probability,
our bounds show that there indeed exists a rate-maximizing
channel coherence time that optimally trades
between diversity and cost of estimating the channel.

Notation: Uppercase boldface letters denote matrices
and lowercase boldface letters designate vectors. Uppercase
sans-serif letters (e.g., Q) denote probability distributions,
while lowercase sans-serif letters (e.g., r) are reserved for
probability density functions (pdf). The superscripts $^T$ and $^H$
stand for transposition and Hermitian transposition, respectively.
We denote the identity matrix of dimension $T \times T$
by $\mathbf{I}_T$; the sequence of vectors $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ is written as
$\mathbf{a}^n$. We denote expectation and variance by $E[\cdot]$ and $\mathrm{Var}[\cdot]$,
respectively, and use the notation $E_{\mathbf{x}}[\cdot]$ or $E_{P_{\mathbf{x}}}[\cdot]$ to stress that
expectation is taken with respect to $\mathbf{x}$ with distribution $P_{\mathbf{x}}$.
The relative entropy between two distributions P and Q is
denoted by $D(P \| Q)$ [11, Sec. 8.5]. For two functions $f(x)$
and $g(x)$, the notation $f(x) = O(g(x))$, $x \to \infty$, means that
$\limsup_{x \to \infty} |f(x)/g(x)| < \infty$, and $f(x) = o(g(x))$, $x \to \infty$,
means that $\lim_{x \to \infty} |f(x)/g(x)| = 0$. Furthermore, $\mathcal{CN}(\mathbf{0}, \mathbf{R})$
stands for the distribution of a circularly-symmetric complex
Gaussian random vector with covariance matrix $\mathbf{R}$, and
$\mathrm{Gamma}(\alpha, \beta)$ denotes the gamma distribution [12, Ch. 17]
with parameters $\alpha$ and $\beta$. Finally, $\log(\cdot)$ indicates the natural
logarithm, $\Gamma(\cdot)$ denotes the gamma function [13, Eq. (6.1.1)],
and $\psi(\cdot)$ designates the digamma function [13, Eq. (6.3.2)].

II. Channel Model and Fundamental Limits 

We consider a single-antenna Rayleigh block-fading channel
with coherence time $T$. Within the $l$th coherence interval, the
channel input-output relation can be written as

$$\mathbf{y}_l = s_l \mathbf{x}_l + \mathbf{w}_l \tag{1}$$

where $\mathbf{x}_l$ and $\mathbf{y}_l$ are the input and output signals, respectively,
$\mathbf{w}_l \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_T)$ is the additive noise, and $s_l \sim \mathcal{CN}(0, 1)$
models the fading, whose realization we assume is not known
at the transmitter and receiver (noncoherent setting). In addition,
we assume that $\{s_l\}$ and $\{\mathbf{w}_l\}$ take on independent
realizations over successive coherence intervals.
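As an illustration, the channel law in (1) can be simulated directly. The following is a minimal Python sketch, assuming the relation $\mathbf{y}_l = s_l \mathbf{x}_l + \mathbf{w}_l$ with $s_l \sim \mathcal{CN}(0, 1)$ and i.i.d. $\mathcal{CN}(0, 1)$ noise entries; the helper names `cn01` and `block_fading_channel`, and the example values of $T$, $\rho$, and $L$, are ours and serve only as an example.

```python
import math
import random

def cn01(rng):
    """Draw one CN(0, 1) sample: real and imaginary parts are N(0, 1/2)."""
    sigma = math.sqrt(0.5)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

def block_fading_channel(blocks, rng):
    """Pass a codeword, given as L blocks of T complex symbols, through the channel.

    A fresh fading coefficient s is drawn for every block (coherence interval)
    and is shared by all T symbols of that block; the noise is i.i.d. CN(0, 1).
    """
    outputs = []
    for x in blocks:
        s = cn01(rng)  # not revealed to the transmitter or the receiver
        outputs.append([s * xi + cn01(rng) for xi in x])
    return outputs

# Example (illustrative values): L blocks, each with energy T*rho in the first symbol.
T, rho, L = 4, 10.0, 20000
rng = random.Random(0)
codeword = [[complex(math.sqrt(T * rho), 0.0)] + [0j] * (T - 1) for _ in range(L)]
received = block_fading_channel(codeword, rng)

# Average received energy per block; its expectation is T * (1 + rho).
avg_energy = sum(abs(yi) ** 2 for y in received for yi in y) / L
```

Because the fading coefficient is redrawn once per block, a codeword of $L$ blocks sees $L$ independent fading realizations, which is the source of the diversity discussed in the introduction.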

We consider channel coding schemes employing codewords
of length $n = LT$. Therefore, each codeword spans $L$
independent fading realizations. Furthermore, the codewords
are assumed to satisfy the following power constraint

$$\sum_{l=1}^{L} \|\mathbf{x}_l\|^2 \le LT\rho. \tag{2}$$



Since the variance of $s_l$ and of the entries of $\mathbf{w}_l$ is normalized
to one, $\rho$ in (2) can be interpreted as the SNR at the receiver.

Let $R^*(n, \epsilon)$ be the maximal achievable rate among all
codes with block-length $n$ and decodable with probability of
error $\epsilon$. For every fixed $T$ and $\epsilon$, we have¹

$$\lim_{n \to \infty} R^*(n, \epsilon) = C(\rho) = \frac{1}{T} \sup_{P_{\mathbf{x}}} I(\mathbf{x}; \mathbf{y}) \tag{3}$$

where $C(\rho)$ is the capacity of the channel in (1), $I(\mathbf{x}; \mathbf{y})$
denotes the mutual information between $\mathbf{x}$ and $\mathbf{y}$, and the
supremum in (3) is taken over all input distributions $P_{\mathbf{x}}$ that
satisfy

$$E[\|\mathbf{x}\|^2] \le T\rho. \tag{4}$$

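No closed-form expression of $C(\rho)$ is available to date.
The following lower bound $L(\rho)$ on $C(\rho)$ is reported in [6,
Eq. (12)]

$$L(\rho) = \frac{1}{T} \Bigg( (T-1)\log(T\rho) - \log\Gamma(T) - T + \frac{T(1+\rho)}{1+T\rho} - \int_0^{\infty} e^{-u}\, \tilde{\gamma}(T-1, T\rho u) \Big(1 + \frac{1}{T\rho}\Big)^{T-1} \log\big(u^{1-T}\, \tilde{\gamma}(T-1, T\rho u)\big)\, du \Bigg) \tag{5}$$

where

$$\tilde{\gamma}(n, x) \triangleq \frac{1}{\Gamma(n)} \int_0^{x} t^{n-1} e^{-t}\, dt$$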

denotes the regularized incomplete gamma function. The input
distribution used in [6] to establish (5) is the i.d. unitary distribution,
where the input vector takes on the form $\mathbf{x} = \sqrt{T\rho}\, \mathbf{u}_x$
with $\mathbf{u}_x$ uniformly distributed on the unit sphere in $\mathbb{C}^T$. We
shall denote this input distribution as $P_{\mathbf{x}}^{(U)}$. It can be shown
that $L(\rho)$ is asymptotically tight at high SNR (see [7, Thm. 4]),
i.e.,

$$C(\rho) = L(\rho) + o(1), \quad \rho \to \infty.$$

¹ The subscript $l$ is omitted whenever immaterial.
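The lower bound $L(\rho)$ involves a one-dimensional integral that has to be evaluated numerically. The following Python sketch shows one way to do this with the standard library only. It assumes our reading of (5), namely $L(\rho) = \frac{1}{T}\big((T-1)\log(T\rho) - \log\Gamma(T) - T + \frac{T(1+\rho)}{1+T\rho} - \int_0^\infty f(u) \log(u^{1-T}\tilde{\gamma}(T-1, T\rho u))\, du\big)$ with $f(u) = e^{-u}\tilde{\gamma}(T-1, T\rho u)(1 + 1/(T\rho))^{T-1}$. The function names, the truncation point `u_max`, and the grid size `m` are illustrative choices of ours, and the incomplete gamma routine is restricted to integer first arguments.

```python
import math

def log_reg_lower_gamma(n, x):
    """log of the regularized lower incomplete gamma function, integer n >= 1, x > 0."""
    if x < n + 1.0:
        # gamma~(n, x) = e^{-x} * sum_{k >= n} x^k / k!, summed relative to x^n / n!
        term, total, k = 1.0, 1.0, n
        while term > 1e-17 * total:
            k += 1
            term *= x / k
            total += term
        return -x + n * math.log(x) - math.lgamma(n + 1) + math.log(total)
    if x > 700.0:
        return 0.0  # complement is below double precision (assumes n well below 700)
    # gamma~(n, x) = 1 - e^{-x} * sum_{k < n} x^k / k!
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= x / k
        total += term
    return math.log1p(-math.exp(-x) * total)

def simpson(f, a, b, m):
    """Composite Simpson rule with m (even) subintervals."""
    h = (b - a) / m
    acc = f(a) + f(b)
    for j in range(1, m):
        acc += (4 if j % 2 else 2) * f(a + j * h)
    return acc * h / 3.0

def output_norm_pdf(u, T, rho):
    """f(u): pdf of u = ||y||^2 / (1 + T*rho) under the i.d. unitary input."""
    if u <= 0.0:
        return 0.0
    lg = log_reg_lower_gamma(T - 1, T * rho * u)
    return math.exp(-u + lg + (T - 1) * math.log1p(1.0 / (T * rho)))

def unitary_lower_bound(T, rho, u_max=50.0, m=20000):
    """L(rho) in nats per channel use, following our reading of (5); needs T >= 2."""
    def integrand(u):
        if u <= 0.0:
            return 0.0  # the integrand vanishes as u -> 0 for T >= 2
        lg = log_reg_lower_gamma(T - 1, T * rho * u)
        return output_norm_pdf(u, T, rho) * ((1 - T) * math.log(u) + lg)
    bracket = ((T - 1) * math.log(T * rho) - math.lgamma(T) - T
               + T * (1 + rho) / (1 + T * rho)
               - simpson(integrand, 0.0, u_max, m))
    return bracket / T

# Example with illustrative values: T = 10 and rho = 10 (i.e., 10 dB).
L_example = unitary_lower_bound(T=10, rho=10.0)
```

A quick sanity check on such an implementation is that `output_norm_pdf` integrates to one and that the returned value stays below the coherent capacity at the same SNR.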

III. Bounds on $R^*(n, \epsilon)$

A. Perfect-Channel-Knowledge Upper Bound 

We establish a simple upper bound on $R^*(n, \epsilon)$ by assuming
that the receiver has perfect knowledge of the realizations of
the fading process $\{s_l\}$. Specifically, we have that

$$R^*(n, \epsilon) \le R^*_{\mathrm{coh}}(n, \epsilon) \tag{6}$$

where $R^*_{\mathrm{coh}}(n, \epsilon)$ denotes the maximal achievable rate for a
given block-length $n$ and probability of error $\epsilon$ in the coherent
setting.

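By generalizing the method used in [10] for stationary
ergodic fading channels to the present case of block-fading
channels, we obtain the following asymptotic expression
for $R^*_{\mathrm{coh}}(n, \epsilon)$:

$$R^*_{\mathrm{coh}}(n, \epsilon) = C_{\mathrm{coh}}(\rho) - \sqrt{\frac{V_{\mathrm{coh}}(\rho)}{n}}\, Q^{-1}(\epsilon) + o\!\left(\frac{1}{\sqrt{n}}\right), \quad n \to \infty. \tag{7}$$

Here, $C_{\mathrm{coh}}(\rho)$ is the capacity of the block-fading channel in
the coherent setting, which is given by [1, Eq. (3.3.10)]

$$C_{\mathrm{coh}}(\rho) = E_s\big[\log(1 + \rho |s|^2)\big], \tag{8}$$

$Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt$ denotes the Q-function, and

$$V_{\mathrm{coh}}(\rho) = T\, \mathrm{Var}\big[\log(1 + \rho |s|^2)\big] + 1 - E^2\!\left[\frac{1}{1 + \rho |s|^2}\right]$$

is the channel dispersion. Neglecting the $o(1/\sqrt{n})$ term in (7),
we obtain the following approximation for $R^*_{\mathrm{coh}}(n, \epsilon)$:

$$R^*_{\mathrm{coh}}(n, \epsilon) \approx C_{\mathrm{coh}}(\rho) - \sqrt{\frac{V_{\mathrm{coh}}(\rho)}{n}}\, Q^{-1}(\epsilon). \tag{9}$$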



It was reported in [14], [15] that approximations similar to
(9) are accurate for many channels for block-lengths and error
probabilities of practical interest. Hence, we will use (9) to
evaluate $R^*_{\mathrm{coh}}(n, \epsilon)$ in the remainder of the paper.
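The normal approximation for the coherent setting can be evaluated with a few lines of code. The sketch below is one possible Python implementation, assuming $C_{\mathrm{coh}}(\rho) = E[\log(1 + \rho|s|^2)]$, $V_{\mathrm{coh}}(\rho) = T\,\mathrm{Var}[\log(1 + \rho|s|^2)] + 1 - E^2[1/(1 + \rho|s|^2)]$, and $|s|^2$ exponentially distributed with unit mean. The function names, the Simpson-rule quadrature, and its truncation at `z_max` are our own choices. Rates are in nats per channel use.

```python
import math
from statistics import NormalDist

def rayleigh_expectation(g, z_max=60.0, m=60000):
    """E[g(|s|^2)] for s ~ CN(0, 1), i.e. |s|^2 ~ Exp(1), by composite Simpson."""
    h = z_max / m
    acc = g(0.0) + g(z_max) * math.exp(-z_max)
    for j in range(1, m):
        z = j * h
        acc += (4 if j % 2 else 2) * g(z) * math.exp(-z)
    return acc * h / 3.0

def coherent_capacity(rho):
    """C_coh(rho) = E[log(1 + rho |s|^2)]."""
    return rayleigh_expectation(lambda z: math.log1p(rho * z))

def coherent_dispersion(rho, T):
    """V_coh(rho) = T Var[log(1 + rho |s|^2)] + 1 - E^2[1 / (1 + rho |s|^2)]."""
    c = coherent_capacity(rho)
    second_moment = rayleigh_expectation(lambda z: math.log1p(rho * z) ** 2)
    inv = rayleigh_expectation(lambda z: 1.0 / (1.0 + rho * z))
    return T * (second_moment - c * c) + 1.0 - inv * inv

def coherent_normal_approx(n, eps, rho, T):
    """C_coh - sqrt(V_coh / n) * Q^{-1}(eps), the approximation in (9)."""
    q_inv = NormalDist().inv_cdf(1.0 - eps)  # Q^{-1}(eps)
    return coherent_capacity(rho) - math.sqrt(coherent_dispersion(rho, T) / n) * q_inv

# Example with illustrative values of n and eps.
rate = coherent_normal_approx(n=1000, eps=1e-3, rho=10.0, T=50)
```

Since the dispersion grows linearly in $T$, the approximation decreases with the coherence time for fixed $n$, which reflects the diversity advantage of faster fading when the channel is known at the receiver.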

B. Upper Bound through Fano's Inequality

Our second upper bound follows from Fano's inequality [11,
Thm. 2.10.1]

$$R^*(n, \epsilon) \le \frac{C(\rho) + H(\epsilon)/n}{1 - \epsilon} \tag{10}$$

where $H(x) = -x \log x - (1 - x) \log(1 - x)$ is the binary
entropy function. Since no closed-form expression is available
for $C(\rho)$, we will further upper-bound the right-hand side
(RHS) of (10) by replacing $C(\rho)$ with the capacity upper
bound we shall derive below.
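For concreteness, the right-hand side of (10) is straightforward to compute once a value (or an upper bound) for the capacity is available. The following Python sketch assumes the form $(C + H(\epsilon)/n)/(1 - \epsilon)$ with all quantities in nats; the function names and the numerical inputs are placeholders of ours.

```python
import math

def binary_entropy(x):
    """H(x) = -x log x - (1 - x) log(1 - x) in nats, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log1p(-x)

def fano_rate_upper_bound(capacity, n, eps):
    """Right-hand side of (10): (C + H(eps) / n) / (1 - eps)."""
    return (capacity + binary_entropy(eps) / n) / (1.0 - eps)

# Example with a placeholder capacity value of 2 nats per channel use.
bound = fano_rate_upper_bound(capacity=2.0, n=1000, eps=1e-3)
```

The bound always exceeds the capacity value it is fed, and the gap shrinks as $n$ grows and $\epsilon$ decreases.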

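Fig. 1. $U(\rho)$ in (17), $L(\rho)$ in (5), and $C_{\mathrm{coh}}(\rho)$ in (8) as a function of the
channel's coherence time $T$; $\rho$ = 10 dB.

Let $P_{\mathbf{y}|\mathbf{x}}$ denote the conditional distribution of $\mathbf{y}$ given
$\mathbf{x}$, and $P_{\mathbf{y}}$ denote the distribution induced on $\mathbf{y}$ by the
input distribution $P_{\mathbf{x}}$ through (1). Furthermore, let $Q_{\mathbf{y}}$ be an
arbitrary distribution on $\mathbf{y}$ with pdf $q_{\mathbf{y}}(\mathbf{y})$. We can upper-bound
$I(\mathbf{x}; \mathbf{y})$ in (3) by duality as follows [16, Thm. 5.1]:

$$I(\mathbf{x}; \mathbf{y}) \le E\big[D(P_{\mathbf{y}|\mathbf{x}} \| Q_{\mathbf{y}})\big] = -E_{P_{\mathbf{y}}}[\log q_{\mathbf{y}}(\mathbf{y})] - h(\mathbf{y} | \mathbf{x}). \tag{11}$$

Since

$$T\rho - E[\|\mathbf{x}\|^2] \ge 0 \tag{12}$$

for every $P_{\mathbf{x}}$ satisfying (4), we can upper-bound $C(\rho)$ in (3)
by using (11) and (12) to obtain

$$C(\rho) \le \frac{1}{T} \inf_{\lambda \ge 0} \sup_{P_{\mathbf{x}}} \Big\{ -E_{P_{\mathbf{y}}}[\log q_{\mathbf{y}}(\mathbf{y})] - h(\mathbf{y} | \mathbf{x}) + \lambda \big(T\rho - E[\|\mathbf{x}\|^2]\big) \Big\}. \tag{13}$$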

The same bounding technique was previously used in [17] to 
obtain upper bounds on the capacity of the phase-noise AWGN 
channel (see also [18]). 

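We next evaluate the RHS of (13) for the following pdf

$$q_{\mathbf{y}}(\mathbf{y}) = \frac{\Gamma(T)\, \|\mathbf{y}\|^{2(1-T)}\, e^{-\|\mathbf{y}\|^2 / [T(\rho+1)]}}{\pi^T\, T(\rho+1)}, \quad \mathbf{y} \in \mathbb{C}^T. \tag{14}$$

Thus, $\mathbf{y}$ is i.d. and $\|\mathbf{y}\|^2 \sim \mathrm{Gamma}(1, T(1+\rho))$. Substituting
(14) into $E_{P_{\mathbf{y}}}[\log q_{\mathbf{y}}(\mathbf{y})]$ in (13), we obtain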

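$$\begin{aligned}
-E_{P_{\mathbf{y}}}[\log q_{\mathbf{y}}(\mathbf{y})] &= \log\frac{T(1+\rho)\pi^T}{\Gamma(T)} + \frac{T + E[\|\mathbf{x}\|^2]}{T(\rho+1)} + (T-1)\, E\big[\log\big((1 + \|\mathbf{x}\|^2) z_1 + z_2\big)\big] \\
&= \log\frac{T(1+\rho)\pi^T}{\Gamma(T)} + (T-1)\psi(T-1) + E\left[\sum_{k=0}^{\infty} \frac{(T-1)(1 + 1/\|\mathbf{x}\|^2)^{-k}}{k + T - 1} + \frac{T + \|\mathbf{x}\|^2}{T(1+\rho)}\right].
\end{aligned} \tag{15}$$

The first equality in (15) follows because the random variable
$\|\mathbf{y}\|^2$ is conditionally distributed as $(1 + \|\mathbf{x}\|^2) z_1 + z_2$ given
$\mathbf{x}$, where $z_1 \sim \mathrm{Gamma}(1, 1)$ and $z_2 \sim \mathrm{Gamma}(T - 1, 1)$.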
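The closed-form evaluation of the expected logarithm can be checked numerically. The Python sketch below compares a Monte Carlo estimate of $E[\log((1 + a) z_1 + z_2)]$, for a fixed input energy $a = \|\mathbf{x}\|^2$, with the series $\psi(T-1) + \sum_{k \ge 0} c^k/(k + T - 1)$, where $c = a/(1+a) = (1 + 1/a)^{-1}$. This series is our reading of the second equality in (15); it can be obtained by writing $(1+a) z_1 + z_2$ as a mixture of $\mathrm{Gamma}(T + k, 1)$ variables with geometric weights $(1 - c) c^k$. The function names and the test values of $T$ and $a$ are ours.

```python
import math
import random

def digamma_int(n):
    """Digamma function at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    euler_gamma = 0.5772156649015329
    return -euler_gamma + sum(1.0 / j for j in range(1, n))

def expected_log_closed_form(T, a, tol=1e-15):
    """psi(T-1) + sum_{k>=0} c^k / (k + T - 1) with c = a / (1 + a); needs T >= 2."""
    c = a / (1.0 + a)
    total, power, k = 0.0, 1.0, 0
    while power / (k + T - 1) > tol:
        total += power / (k + T - 1)
        power *= c
        k += 1
    return digamma_int(T - 1) + total

def expected_log_monte_carlo(T, a, trials, rng):
    """Monte Carlo estimate of E[log((1 + a) z1 + z2)], z1 ~ Gamma(1,1), z2 ~ Gamma(T-1,1)."""
    acc = 0.0
    for _ in range(trials):
        z1 = rng.expovariate(1.0)
        z2 = rng.gammavariate(T - 1, 1.0)
        acc += math.log((1.0 + a) * z1 + z2)
    return acc / trials

# Example with illustrative values T = 5 and ||x||^2 = 3.
closed = expected_log_closed_form(T=5, a=3.0)
estimate = expected_log_monte_carlo(T=5, a=3.0, trials=100000, rng=random.Random(0))
```

For $a = 0$ the series reduces to $\psi(T-1) + 1/(T-1) = \psi(T)$, the expected logarithm of a $\mathrm{Gamma}(T, 1)$ variable, as it should.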



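Substituting (15) into (13), and using that the differential
entropy $h(\mathbf{y} | \mathbf{x})$ is given by

$$h(\mathbf{y} | \mathbf{x}) = E_{\mathbf{x}}\big[\log(1 + \|\mathbf{x}\|^2)\big] + T \log(\pi e)$$

we obtain

$$C(\rho) \le \frac{c_\rho}{T} + \frac{1}{T} \inf_{\lambda \ge 0} \sup_{P_{\mathbf{x}}} E\left[\sum_{k=0}^{\infty} \frac{(T-1)(1 + 1/\|\mathbf{x}\|^2)^{-k}}{k + T - 1} - \log(1 + \|\mathbf{x}\|^2) + \frac{\|\mathbf{x}\|^2}{T(1+\rho)} + \lambda\big(T\rho - \|\mathbf{x}\|^2\big)\right] \tag{16}$$

$$\overset{(a)}{\le} \frac{c_\rho}{T} + \frac{1}{T} \inf_{\lambda \ge 0} \sup_{\|\mathbf{x}\|} \left\{\sum_{k=0}^{\infty} \frac{(T-1)(1 + 1/\|\mathbf{x}\|^2)^{-k}}{k + T - 1} - \log(1 + \|\mathbf{x}\|^2) + \frac{\|\mathbf{x}\|^2}{T(1+\rho)} + \lambda\big(T\rho - \|\mathbf{x}\|^2\big)\right\} \triangleq U(\rho) \tag{17}$$

where

$$c_\rho \triangleq \log\frac{T(1+\rho)}{\Gamma(T)} - T + \frac{1}{1+\rho} + (T-1)\psi(T-1).$$

To obtain (a), we upper-bounded the second term on the RHS
of (16) by replacing the expectation over $\|\mathbf{x}\|$ by the supremum
over $\|\mathbf{x}\|$.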

The bounds $L(\rho)$ and $U(\rho)$ are plotted in Fig. 1 as a
function of the channel's coherence time $T$ for SNR equal to
10 dB. For reference, we also plot the capacity in the coherent
setting [$C_{\mathrm{coh}}(\rho)$ in (8)]. We observe that $U(\rho)$ and $L(\rho)$ are
surprisingly close for all values of $T$.

At low SNR, the gap between $U(\rho)$ and $L(\rho)$ increases. In
this regime, $U(\rho)$ can be tightened by replacing $q_{\mathbf{y}}(\mathbf{y})$ in (13)
by the output pdf induced by the i.d. unitary input distribution
$P_{\mathbf{x}}^{(U)}$, which is given by

$$q_{\mathbf{y}}^{(U)}(\mathbf{y}) = \frac{e^{-\|\mathbf{y}\|^2/(1+T\rho)}\, \|\mathbf{y}\|^{2(1-T)}\, \Gamma(T)}{\pi^T (1 + T\rho)} \times \tilde{\gamma}\!\left(T-1, \frac{T\rho \|\mathbf{y}\|^2}{1 + T\rho}\right) \left(1 + \frac{1}{T\rho}\right)^{T-1}. \tag{18}$$



Substituting (17) into (10), we obtain the following upper
bound on $R^*(n, \epsilon)$:

$$R^*(n, \epsilon) \le \bar{R}(n, \epsilon) \triangleq \frac{U(\rho) + H(\epsilon)/n}{1 - \epsilon}. \tag{19}$$

C. Dependence Testing (DT) Lower Bound 

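We next present a lower bound on $R^*(n, \epsilon)$ that is based
on the DT bound recently proposed by Polyanskiy, Poor, and
Verdú [14]. The DT bound uses a threshold decoder that
sequentially tests all messages and returns the first message
whose likelihood exceeds a pre-determined threshold. With
this approach, one can show that for a given input distribution
$P_{\mathbf{x}^L}$, there exists a code with $M$ codewords and average
probability of error not exceeding [14, Thm. 17]

$$\epsilon \le E_{P_{\mathbf{x}^L}}\left[ P_{\mathbf{y}^L | \mathbf{x}^L}\!\left( i(\mathbf{x}^L; \mathbf{y}^L) \le \log\frac{M-1}{2} \right) + \frac{M-1}{2}\, P_{\mathbf{y}^L}\!\left( i(\mathbf{x}^L; \mathbf{y}^L) > \log\frac{M-1}{2} \right) \right] \tag{20}$$

where

$$i(\mathbf{x}^L; \mathbf{y}^L) \triangleq \log\frac{p_{\mathbf{y}^L | \mathbf{x}^L}(\mathbf{y}^L | \mathbf{x}^L)}{p_{\mathbf{y}^L}(\mathbf{y}^L)} \tag{21}$$

is the information density. Note that, conditioned on $\mathbf{x}^L$, the
output vectors $\mathbf{y}_l$, $l = 1, \ldots, L$, are independent and Gaussian
distributed. The pdf of $\mathbf{y}_l$ is given by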



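$$p_{\mathbf{y}|\mathbf{x}}(\mathbf{y}_l | \mathbf{x}_l) = \frac{\exp\!\big(-\mathbf{y}_l^H (\mathbf{I}_T + \mathbf{x}_l \mathbf{x}_l^H)^{-1} \mathbf{y}_l\big)}{\pi^T \det(\mathbf{I}_T + \mathbf{x}_l \mathbf{x}_l^H)} \overset{(a)}{=} \frac{1}{\pi^T (1 + \|\mathbf{x}_l\|^2)} \exp\!\left(-\|\mathbf{y}_l\|^2 + \frac{|\mathbf{y}_l^H \mathbf{x}_l|^2}{1 + \|\mathbf{x}_l\|^2}\right) \tag{22}$$

where (a) follows from Woodbury's matrix identity [19, p. 19].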

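To evaluate (20), we choose $\mathbf{x}_l$, $l = 1, \ldots, L$, to be
independently and identically distributed according to the i.d.
unitary distribution $P_{\mathbf{x}}^{(U)}$. The pdf of the corresponding output
distribution is equal to

$$q_{\mathbf{y}^L}^{(U)}(\mathbf{y}^L) = \prod_{l=1}^{L} q_{\mathbf{y}}^{(U)}(\mathbf{y}_l)$$

where $q_{\mathbf{y}}^{(U)}(\cdot)$ is given in (18). Substituting (22) and (18) into
(21), we obtain

$$i(\mathbf{x}^L; \mathbf{y}^L) = \sum_{l=1}^{L} i(\mathbf{x}_l; \mathbf{y}_l) \tag{23}$$

where

$$i(\mathbf{x}_l; \mathbf{y}_l) = \log\frac{1 + T\rho}{\Gamma(T)} + \frac{|\mathbf{y}_l^H \mathbf{x}_l|^2}{1 + \|\mathbf{x}_l\|^2} - \frac{T\rho \|\mathbf{y}_l\|^2}{1 + T\rho} + (T-1) \log\frac{T\rho \|\mathbf{y}_l\|^2}{1 + T\rho} - \log\tilde{\gamma}\!\left(T-1, \frac{T\rho \|\mathbf{y}_l\|^2}{1 + T\rho}\right) - \log(1 + \|\mathbf{x}_l\|^2).$$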

Due to the isotropy of both the input distribution $P_{\mathbf{x}}^{(U)}$ and
the output distribution $Q_{\mathbf{y}}^{(U)}$, the distribution of the information
density $i(\mathbf{x}^L; \mathbf{y}^L)$ depends on $P_{\mathbf{x}}^{(U)}$ only through the distribution
of the norm of the inputs $\mathbf{x}_l$. Furthermore, under $P_{\mathbf{x}}^{(U)}$, we
have that $\|\mathbf{x}_l\| = \sqrt{T\rho}$ with probability 1, $l = 1, \ldots, L$. This
allows us to simplify the computation of (20) by choosing
an arbitrary input sequence $\mathbf{x}_l = \bar{\mathbf{x}} = [\sqrt{T\rho}, 0, \ldots, 0]^T$,
$l = 1, \ldots, L$. Substituting (23) into (20), we obtain the desired
lower bound on $R^*(n, \epsilon)$ by solving numerically the following
maximization problem

$$\underline{R}(n, \epsilon) \triangleq \max\left\{ \frac{1}{n} \log M : M \text{ satisfies (20)} \right\}. \tag{24}$$
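One way to carry out the numerical maximization in (24) for short block-lengths is by Monte Carlo simulation. The Python sketch below is an illustration under several assumptions of ours. It uses the fixed input $\bar{\mathbf{x}} = [\sqrt{T\rho}, 0, \ldots, 0]^T$, for which our reading of (23) gives the per-block information density $-\log\Gamma(T) - a r_2 + (T-1)\log(a r) - \log\tilde{\gamma}(T-1, a r)$, with $a = T\rho/(1+T\rho)$, $r = \|\mathbf{y}\|^2$, and $r_2$ the energy of the last $T-1$ outputs. It also uses the equivalent form $E[\exp(-[i - \log((M-1)/2)]^+)]$ of the right-hand side of (20), and it finds the largest $\log M$ by bisection. All function names and the example parameters are ours, and the result is only a Monte Carlo estimate of the bound.

```python
import math
import random

def log_reg_lower_gamma(n, x):
    """log of the regularized lower incomplete gamma function, integer n >= 1, x > 0."""
    if x < n + 1.0:
        term, total, k = 1.0, 1.0, n
        while term > 1e-17 * total:
            k += 1
            term *= x / k
            total += term
        return -x + n * math.log(x) - math.lgamma(n + 1) + math.log(total)
    if x > 700.0:
        return 0.0  # complement is below double precision (assumes n well below 700)
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= x / k
        total += term
    return math.log1p(-math.exp(-x) * total)

def sample_block_density(T, rho, rng):
    """One sample of the per-block information density for x = [sqrt(T*rho), 0, ..., 0]."""
    a = T * rho / (1.0 + T * rho)
    r1 = (1.0 + T * rho) * rng.expovariate(1.0)  # |y_1|^2, where y_1 ~ CN(0, 1 + T*rho)
    r2 = rng.gammavariate(T - 1, 1.0)            # energy of the remaining T-1 outputs
    x = a * (r1 + r2)
    return (-math.lgamma(T) - a * r2 + (T - 1) * math.log(x)
            - log_reg_lower_gamma(T - 1, x))

def dt_error_bound(sums, log_m):
    """Estimate of E[exp(-[i - log((M-1)/2)]^+)] from samples of i(x^L; y^L)."""
    thr = log_m + math.log1p(-math.exp(-log_m)) - math.log(2.0)  # log((M-1)/2)
    return sum(math.exp(-max(0.0, s - thr)) for s in sums) / len(sums)

def dt_rate(sums, n, eps, iters=60):
    """Largest (1/n) log M whose estimated error bound does not exceed eps."""
    lo, hi = 1e-9, max(sums) + 2.0
    if dt_error_bound(sums, lo) > eps:
        return 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dt_error_bound(sums, mid) <= eps:
            lo = mid
        else:
            hi = mid
    return lo / n

# Example with illustrative values: T = 10, rho = 10, L = 20 blocks, eps = 0.1.
T, rho, L, eps = 10, 10.0, 20, 0.1
n = L * T
rng = random.Random(1)
sums = [sum(sample_block_density(T, rho, rng) for _ in range(L)) for _ in range(2000)]
rate = dt_rate(sums, n, eps)  # nats per channel use
```

The bisection relies on the fact that the error bound is nondecreasing in $\log M$. For small $\epsilon$ or large $n$, plain Monte Carlo needs many samples to resolve the tail, which is one reason an approximation is useful.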




Fig. 2. Bounds on maximal achievable rate $R^*(n, \epsilon)$ for noncoherent
Rayleigh block-fading channels; $\rho$ = 10 dB, $T$ = 50, $\epsilon$ = 10"'''.



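The computation of the DT bound $\underline{R}(n, \epsilon)$ becomes difficult
as the block-length $n$ becomes large. We next provide an
approximation for $\underline{R}(n, \epsilon)$, which is much easier to evaluate.
As in [15, App. A], applying the Berry-Esseen inequality [14,
Thm. 44] to the first term on the RHS of (20), and applying
[20, Lemma 20] to the second term on the RHS of (20), we
get the following asymptotic expansion for $\underline{R}(n, \epsilon)$

$$\underline{R}(n, \epsilon) = L(\rho) - \sqrt{\frac{\underline{V}(\rho)}{n}}\, Q^{-1}(\epsilon) + O\!\left(\frac{1}{n}\right), \quad n \to \infty \tag{25}$$

with $\underline{V}(\rho)$ given by

$$\underline{V}(\rho) \triangleq \frac{1}{T} E_{P_{\mathbf{x}}^{(U)}}\big[\mathrm{Var}[i(\mathbf{x}; \mathbf{y}) \,|\, \mathbf{x}]\big] = \frac{1}{T} \mathrm{Var}[i(\bar{\mathbf{x}}; \mathbf{y})]$$

where, as in the DT bound, we can choose $\bar{\mathbf{x}} = [\sqrt{T\rho}, 0, \ldots, 0]^T$.
By neglecting the $O(1/n)$ term in (25), we
arrive at the following approximation for $\underline{R}(n, \epsilon)$

$$\underline{R}(n, \epsilon) \approx L(\rho) - \sqrt{\frac{\underline{V}(\rho)}{n}}\, Q^{-1}(\epsilon). \tag{26}$$

Although the term $\underline{V}(\rho)$ in (26) needs to be computed numerically,
the computational complexity of (26) is much lower
than that of the DT bound $\underline{R}(n, \epsilon)$.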
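Given samples of the per-block information density, the approximation in (26) reduces to a sample mean and a sample variance. The following Python sketch shows this step. It assumes that `block_densities` holds samples of $i(\bar{\mathbf{x}}; \mathbf{y})$ obtained elsewhere (for instance by simulation), and that $L(\rho)$ and $\underline{V}(\rho)$ equal the mean and the variance of the information density divided by $T$. The function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pvariance

def dt_normal_approx(block_densities, T, n, eps):
    """Approximation (26): L_hat - sqrt(V_hat / n) * Q^{-1}(eps), in nats per channel use.

    block_densities: samples of the per-block information density i(x; y) in nats.
    """
    l_hat = fmean(block_densities) / T      # estimate of L(rho)
    v_hat = pvariance(block_densities) / T  # estimate of the dispersion term
    q_inv = NormalDist().inv_cdf(1.0 - eps)  # Q^{-1}(eps)
    return l_hat - math.sqrt(v_hat / n) * q_inv

# Toy example with two made-up samples; real use would take many simulated samples.
approx = dt_normal_approx([16.0, 18.0], T=10, n=1000, eps=1e-3)
```

Only the scalar statistics of a single coherence block are needed, so the cost does not grow with the block-length $n$.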

D. Numerical Results and Discussions 

In Fig. 2, we plot the upper bound $\bar{R}(n, \epsilon)$ in (19), the lower
bound $\underline{R}(n, \epsilon)$ in (24), the approximation of $\underline{R}(n, \epsilon)$ in (26),
and the approximation of $R^*_{\mathrm{coh}}(n, \epsilon)$ in (9) as a function of
the block-length $n$ for $T$ = 50, $\epsilon$ = 10'^ and $\rho$ = 10 dB. For
reference, we also plot the coherent capacity $C_{\mathrm{coh}}(\rho)$ in (8). As
illustrated in the figure, (26) gives an accurate approximation
of $\underline{R}(n, \epsilon)$.

In Figs. 3 and 4, we plot the upper bound $\bar{R}(n, \epsilon)$ in (19), the
lower bound $\underline{R}(n, \epsilon)$ in (24), the approximation of $R^*_{\mathrm{coh}}(n, \epsilon)$
in (9), and the coherent capacity $C_{\mathrm{coh}}(\rho)$ in (8) as a function of
the channel's coherence time T for block-lengths n = 4 x 10^ 
and n = 4 X 10^, respectively. We see that, for a given 



Fig. 3. $\bar{R}(n, \epsilon)$ in (19), $\underline{R}(n, \epsilon)$ in (24), approximation of $R^*_{\mathrm{coh}}(n, \epsilon)$
in (9), and $C_{\mathrm{coh}}(\rho)$ in (8) at block-length n = 4 × 10^ as a function of
the channel's coherence time $T$ for the noncoherent Rayleigh block-fading
channel; $\rho$ = 10 dB, $\epsilon$ = 10'^.

Fig. 4. $\bar{R}(n, \epsilon)$ in (19), $\underline{R}(n, \epsilon)$ in (24), approximation of $R^*_{\mathrm{coh}}(n, \epsilon)$
in (9), and $C_{\mathrm{coh}}(\rho)$ in (8) at block-length n = 4 × 10* as a function of
the channel's coherence time $T$ for the noncoherent Rayleigh block-fading
channel; $\rho$ = 10 dB, $\epsilon$ = 10-^.

block-length and error probability, $R^*(n, \epsilon)$ is not monotonic
in the channel's coherence time, but there exists a channel
coherence time $T^*$ that maximizes $R^*(n, \epsilon)$. This confirms
the claim we made in the introduction that there exists a
tradeoff between the diversity gain and the cost of estimating
the channel when communicating in the noncoherent setting
and in the finite block-length regime. A similar phenomenon
was observed in [15] for the Gilbert-Elliott channel with no
state information at the transmitter and receiver.
From Figs. 3 and 4, we also observe that $T^*$ decreases as
we shorten the block-length. For example, the rate-maximizing
channel's coherence time T* for block-length n = 4 x 10^ is 
roughly 64, whereas for block-length n = 4 x 10'^, it is roughly 
28. 